A FORMAL DESCRIPTION OF A SUBSET OF ALGOL


by  John  McCarthy 


Abstract:  We  describe  Microalgol,  a  trivial  subset 
of  Algol,  by  means  of  an  interpreter. 

The  notions  of  abstract  syntax  and  of 
"state  of  the  computation"  permit  a  compact 
description  of  both  syntax  and  semantics. 

We  advocate  an  extension  of  this  technique 
as  a  general  way  of  describing  programming 
languages.

1.  Introduction: 


In my paper Towards a Mathematical Science of Computation,
Proceedings  of  the  ICIP,  1962,  I  advocated  defining  programming 
languages  in  the  following  way: 


1 - Give the syntax in an abstract analytic form, i.e. for each
type  of  expression  name  the  predicates  for  telling  how  it  is  composed 
and  for  getting  its  parts. 


2  -  The  abstract  syntax  makes  no  commitments  about  how  sums, 
products,  etc.  are  actually  represented  by  symbolic  expressions.  To 
define  a  concrete  syntax  one  represents  the  abstract  syntactic  predicates 
and  functions  by  functions  of  strings. 


3  -  Next  one  defines  what  information  is  included  in  describing 
the  state  of  the  computation,  e.g.  this  includes  the  values  currently 
assigned  to  the  program  variables. 


4 - Then one describes the semantics of the language by defining a
function ξ' = lang(π, ξ) that gives the state ξ' that results from
applying the program π to the state ξ.


Our  object  in  this  paper  is  to  carry  this  procedure  out  for  a 
very small subset of Algol called Microalgol. This will illustrate
the  method  in  an  easy  case;  all  the  difficult  aspects  of  Algol  are 
eliminated. 


2.  Informal  Description  of  Microalgol: 


Microalgol  is  a  language  for  programming  about,  not  for  programming 
in.  It  has  no  declarations  and  no  arrays,  and  the  only  statements  are 
assignments and conditional go to's of the form if p then go to a.


In  forming  the  right  sides  of  assignment  statements  one  may  use 
sums,  products,  differences,  quotients  and  conditional  expressions 
involving  the  relational  operators  =  and  <.  All  arithmetic  operators 
take  two  operands.  Here  is  an  example  of  a  Microalgol  program 


root := 1;
a: root := 0.5 × (root + x/root);
error := root × root - x;
perror := if error > 0.0 then error else 0.0 - error;
if perror > .00001 then go to a;
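For readers who want to run the example, the following is a rough Python transliteration, offered as a sketch and not as part of Microalgol. The function name `microalgol_example` and the decision to pass x as an argument are assumptions of this sketch; the conditional go to is rendered as a loop over statements 2 to 5.

```python
def microalgol_example(x: float) -> float:
    """Transliteration of the five-statement example program (x assumed positive)."""
    root = 1.0                                    # statement 1
    while True:
        root = 0.5 * (root + x / root)            # statement 2, labelled a
        error = root * root - x                   # statement 3
        perror = error if error > 0.0 else 0.0 - error   # statement 4
        if not perror > 0.00001:                  # statement 5: go to a while perror is large
            break
    return root
```

Under these assumptions the program computes an approximation to the square root of x by repeated averaging.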


3. Abstract Analytic Syntax of Microalgol:


We  shall  first  give  the  abstract  analytic  syntax  of  the  terms  that 
can  appear  on  the  right  sides  of  assignment  statements.  It  is  given 
by  the  following  table: 


predicate       associated functions                            examples

isvar(τ)                                                        x
isconst(τ)      val(τ)  (N.B. This is a semantic function)      .001
issum(τ)        addend(τ), augend(τ)                            a + x × y
isdiff(τ)       subtrahend(τ), minuend(τ)                       a - x × y
isprod(τ)       multiplier(τ), multiplicand(τ)                  x × (a + b)
isquotient(τ)   numerator(τ), denominator(τ)                    x/root
iscond(τ)       proposition(τ), antecedent(τ), consequent(τ)    if x < 3 then else 2
isequal(τ)      lefteq(τ), righteq(τ)                           x = 3
isless(τ)       leftl(τ), rightl(τ)                             x < 3


The idea, taken from the ICIP paper, is that any term is of one of
the eight types and that the predicates enable us to tell which. Once
the type is decided the syntactic functions associated with that type
are defined and give us the parts of the expression. Thus if τ is
a + x × y then issum(τ) is true and addend(τ) is a and augend(τ) is
x × y.

Now  we  give  the  abstract  syntax  of  Microalgol  statements: 


predicate       associated functions              examples

assignment(s)   left(s), right(s)                 s is "root := 0.5 × (root + x/root)"
                                                  left(s) is "root"
                                                  right(s) is "0.5 × (root + x/root)"

goto(s)         proposition(s), destination(s)    s is "if perror > .00001 then go to a"
                                                  proposition(s) is "perror > .00001"
                                                  destination(s) is "a"

Finally,  we  give  the  abstract  syntax  of  Microalgol  programs: 

1. If π is a program and η is a statement number then statement(η, π)
is the ηth statement of the program. Thus for the program we
have been using as an example, statement(3, π) is
"error := root × root - x".


2. If ℓ is a label then numb(ℓ, π) is the statement number
corresponding to the label if there is one. Thus, in our program, numb(a, π)
is 2.

3. The predicate end(π, η) is true if there is no ηth statement.
Thus end(π, 6) is true in our example.

4.  The  States  of  Microalgol 

The state of a Microalgol computation is given by a state vector ξ
which  tells  us  the  value  currently  assigned  to  each  variable  and  also  the 
statement  number  about  to  be  executed.  We  shall  treat  the  statement 
number  as  a  pseudo-variable  called  sn. 

Associated  with  state  vectors  are  two  functions: 

1. c(var, ξ) gives the value assigned to the variable var in state ξ.


2. a(var, value, ξ) gives the new state that results from the state
ξ when the number value is assigned to the variable var.


Some  of  the  properties  of  state  vectors  are  given  in  the  ICIP 
paper. 
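The two state-vector functions can be illustrated with a small Python sketch. It assumes, purely for illustration, that a state is a dictionary from variable names to numbers; the names `State`, `xi` and `xi2` are not from the paper. The function a returns a new state and leaves the old one untouched, which matches its description as a function from states to states.

```python
from typing import Dict

State = Dict[str, float]   # assumed representation: variable name -> value

def c(var: str, state: State) -> float:
    """Contents function: the value assigned to var in the given state."""
    return state[var]

def a(var: str, value: float, state: State) -> State:
    """Assignment function: a new state equal to the old one except at var."""
    new_state = dict(state)   # copy, so the original state is not modified
    new_state[var] = value
    return new_state

xi = {"sn": 1, "x": 2.0}
xi2 = a("root", 1.0, xi)
```

With this representation one can check simple properties of the kind one expects of state vectors, for example that c(var, a(var, value, ξ)) is value, and that assigning to one variable does not disturb another.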

The values of Microalgol variables are real numbers. Of course,
only  small  integers  can  ever  turn  up  as  values  of  sn. 

5. The Semantics of Microalgol

The semantics of Microalgol is given by a recursively defined
function micro(π, ξ) that gives the state in which a Microalgol
program π will terminate if it is entered in state ξ. First, however,
we give the function value(τ, ξ) that tells us the value of a term.

We  have 

value(τ, ξ) = if isvar(τ) then c(τ, ξ)
    else if isconst(τ) then val(τ)
    else if issum(τ) then value(addend(τ), ξ) + value(augend(τ), ξ)
    else if isdiff(τ) then value(subtrahend(τ), ξ) - value(minuend(τ), ξ)
    else if isprod(τ) then value(multiplier(τ), ξ) × value(multiplicand(τ), ξ)
    else if isquotient(τ) then value(numerator(τ), ξ) / value(denominator(τ), ξ)
    else if iscond(τ) then (if value(proposition(τ), ξ) then
        value(antecedent(τ), ξ) else value(consequent(τ), ξ))
    else if isequal(τ) then (value(lefteq(τ), ξ) = value(righteq(τ), ξ))
    else if isless(τ) then (value(leftl(τ), ξ) < value(rightl(τ), ξ))
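As a worked illustration of the definition, take τ to be the term 0.5 × (root + x/root) from the example program, and assume a state ξ in which c(root, ξ) = 1 and c(x, ξ) = 2. These particular values are chosen only for the illustration. Unfolding the definition one case at a time gives:

```latex
\begin{align*}
\mathrm{value}(\tau, \xi)
  &= \mathrm{value}(0.5, \xi) \times \mathrm{value}(root + x/root, \xi)
     && \text{since } \mathrm{isprod}(\tau) \\
  &= 0.5 \times \bigl(\mathrm{value}(root, \xi) + \mathrm{value}(x/root, \xi)\bigr)
     && \text{isconst, issum} \\
  &= 0.5 \times \bigl(c(root, \xi) + c(x, \xi)/c(root, \xi)\bigr)
     && \text{isvar, isquotient} \\
  &= 0.5 \times (1 + 2/1) = 1.5
\end{align*}
```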

For  definiteness  we  shall  assume  that  the  arithmetic  in  Microalgol  as 
expressed by the operators + - × / = < is real number arithmetic. Little
would  be  changed,  however,  if  we  restricted  the  values  of  variables  to 
numbers  represented  in  some  machine  and  meant  by  the  operators  the 
operations  of  the  machine. 

We  now  can  write 

micro(π, ξ) = (λ n. if end(π, n) then ξ
    else (λ s. if assignment(s) then
            micro(π, a(sn, n + 1, a(left(s), value(right(s), ξ), ξ)))
        else if goto(s) then
            micro(π, a(sn, if value(proposition(s), ξ) then
                numb(destination(s), π) else n + 1, ξ)))
        (statement(n, π)))
    (c(sn, ξ))
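The two definitions can be tried out with a short Python sketch. Everything about the representation is an assumption made for illustration: terms are nested tuples tagged PLUS, DIFF, TIMES, RATIO, IF, EQUALS and LESS, statements are tuples tagged ASSIGN and GO, a program is a list of (label, statement) pairs with None for a missing label, and a state is a dictionary. The recursion in micro is written as a loop, which computes the same final state without deep recursion. Since the term syntax provides only = and <, the example's tests with > are written with the operands of LESS exchanged.

```python
def value(t, xi):
    """Value of term t in state xi (a dict from variable names to numbers)."""
    if isinstance(t, str):                      # isvar
        return xi[t]
    if isinstance(t, (int, float)):             # isconst
        return t
    op, *args = t
    if op == "IF":                              # iscond: evaluate only the chosen branch
        p, then_t, else_t = args
        return value(then_t, xi) if value(p, xi) else value(else_t, xi)
    left, right = (value(arg, xi) for arg in args)
    if op == "PLUS":
        return left + right
    if op == "DIFF":
        return left - right
    if op == "TIMES":
        return left * right
    if op == "RATIO":
        return left / right
    if op == "EQUALS":
        return left == right
    if op == "LESS":
        return left < right
    raise ValueError(f"unknown term: {t!r}")

def numb(label, pi):
    """Statement number (1-based) of the statement carrying the given label."""
    return next(i + 1 for i, (lab, _) in enumerate(pi) if lab == label)

def micro(pi, xi):
    """Final state reached when program pi is entered in state xi."""
    xi = dict(xi)                               # work on a copy of the initial state
    while True:
        n = xi["sn"]
        if n > len(pi):                         # end(pi, n)
            return xi
        _, s = pi[n - 1]                        # statement(n, pi)
        if s[0] == "ASSIGN":
            _, var, term = s
            xi[var] = value(term, xi)
            xi["sn"] = n + 1
        elif s[0] == "GO":
            _, prop, dest = s
            xi["sn"] = numb(dest, pi) if value(prop, xi) else n + 1
        else:
            raise ValueError(f"unknown statement: {s!r}")

SQRT = [
    (None, ("ASSIGN", "root", 1.0)),
    ("a", ("ASSIGN", "root", ("TIMES", 0.5, ("PLUS", "root", ("RATIO", "x", "root"))))),
    (None, ("ASSIGN", "error", ("DIFF", ("TIMES", "root", "root"), "x"))),
    (None, ("ASSIGN", "perror", ("IF", ("LESS", 0.0, "error"), "error", ("DIFF", 0.0, "error")))),
    (None, ("GO", ("LESS", 0.00001, "perror"), "a")),
]
final = micro(SQRT, {"sn": 1, "x": 2.0})
```

Entered with sn = 1 and x = 2, the sketch terminates when sn passes the last statement, leaving in root an approximation to the square root of x.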

This  completes  the  description  of  abstract  Microalgol.  In  order 
to describe a concrete Microalgol it is only necessary to represent
the abstract syntactic predicates and functions by predicates and
functions  on  strings. 

5.  Two  Concrete  Realizations  of  Abstract  Microalgol 


We shall present two realizations of Microalgol. The first is a
Lisp  S-expression  realization  suitable  for  use  inside  a  machine  and 
the  second  corresponds  to  the  concrete  syntax  of  ALGOL  60. 

A  LISP  realization  called  IMA. 


1. We use LISP expressions for the terms as follows:

a. atoms for variables
b. Lisp numbers for constants
c. (PLUS α β) for sums
d. (DIFF α β)
e. (TIMES α β)
f. (RATIO α β)
g. (IF α β γ)
h. (EQUALS α β)
i. (LESS α β)
j. (ASSIGN α β)
k. (GO α β)

2.  We  represent  a  program  by  a  list  of  its  statements  leaving  a 
place  for  the  label  that  is  left  blank  if  there  is  no  label.  The 
syntactic  functions  are  as  follows: 

a. isvar(τ) = atom[τ] ∧ ¬numberp[τ]

b. isconst(τ) = numberp[τ]

c. issum(τ) = eq[car[τ]; PLUS]
   addend(τ) = cadr[τ]
   augend(τ) = caddr[τ]

d. isprod(τ) = eq[car[τ]; TIMES]
   multiplier(τ) = cadr[τ]
   multiplicand(τ) = caddr[τ]

We omit the obvious definitions for isdiff, isquotient, iscond, isequal,
isless, assignment and goto.

statement(n, π) = if n = 1 then cadar[π] else statement[n-1; cdr[π]]
numb(a, π) = if eq[caar[π]; a] then 1 else 1 + numb[a; cdr[π]]
end(π, η) = null[π] ∨ [η > 1 ∧ end[cdr[π]; η-1]]
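The list realization can be imitated in Python to check the claims made earlier about the example program. This is only a sketch: Python tuples stand in for LISP lists, strings for atoms, and each program entry is assumed to be a (label, statement) pair with None for a blank label, so that the statement is the second element of an entry and the label the first. The type checks in issum and isprod are an addition needed because Python strings can be indexed.

```python
def car(s):
    return s[0]

def cdr(s):
    return s[1:]

def cadr(s):
    return car(cdr(s))

def caddr(s):
    return car(cdr(cdr(s)))

def isconst(t):
    return isinstance(t, (int, float))

def isvar(t):
    return isinstance(t, str)

def issum(t):
    return isinstance(t, tuple) and car(t) == "PLUS"

def isprod(t):
    return isinstance(t, tuple) and car(t) == "TIMES"

addend = multiplier = cadr          # first operand
augend = multiplicand = caddr       # second operand

def statement(n, pi):
    """The nth statement: second element of the nth (label, statement) entry."""
    return cadr(car(pi)) if n == 1 else statement(n - 1, cdr(pi))

def numb(label, pi):
    """Number of the statement whose entry carries the given label."""
    return 1 if car(car(pi)) == label else 1 + numb(label, cdr(pi))

def end(pi, n):
    """True if the program has no nth statement."""
    return len(pi) == 0 or (n > 1 and end(cdr(pi), n - 1))

PROGRAM = (
    (None, ("ASSIGN", "root", 1)),
    ("a", ("ASSIGN", "root", ("TIMES", 0.5, ("PLUS", "root", ("RATIO", "x", "root"))))),
    (None, ("ASSIGN", "error", ("DIFF", ("TIMES", "root", "root"), "x"))),
    (None, ("ASSIGN", "perror", ("IF", ("LESS", 0.0, "error"), "error", ("DIFF", 0.0, "error")))),
    (None, ("GO", ("LESS", 0.00001, "perror"), "a")),
)
```

On this encoding statement(3, PROGRAM) is the assignment to error, numb("a", PROGRAM) is 2, and end(PROGRAM, 6) holds, in agreement with the abstract syntax of programs given above.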

6.  A  Standard  Realization 

In order to describe a realization of Microalgol that corresponds
to ALGOL 60 we need to compute with strings of symbols. For this
purpose we shall use the linear LISP of [3], and we give only a few of the
syntactic predicates and functions, namely statement(η, π), issum(τ),
addend(τ) and isprod(τ).

statement(η, π) = if η = 1 then delim(";", π) else statement(η-1,
    strip(";", π))

delim(a, π) = if first(π) = a then Λ else prefix(first(π), delim(a, rest(π)))

strip(a, π) = if first(π) = a then rest(π) else strip(a, rest(π))

issum(τ) = isop(τ, 0, "+")

isop(τ, η, a) = if null(τ) then F
    else if first(τ) = "(" then isop(rest(τ), η+1, a)
    else if first(τ) = ")" then isop(rest(τ), η-1, a)
    else if η > 0 then isop(rest(τ), η, a)
    else if first(τ) = a then T
    else isop(rest(τ), η, a)

addend(τ) = deparen(delim1("+", 0, τ))

delim1(a, η, τ) = if first(τ) = "(" then prefix(first(τ), delim1(a, η+1, rest(τ)))
    else if first(τ) = ")" then prefix(first(τ), delim1(a, η-1, rest(τ)))
    else if η > 0 then prefix(first(τ), delim1(a, η, rest(τ)))
    else if first(τ) = a then Λ
    else prefix(first(τ), delim1(a, η, rest(τ)))

deparen(τ) = if ¬(first(τ) = "(") then τ else dep1(rest(τ))

dep1(τ) = if ¬null(rest(τ)) then prefix(first(τ), dep1(rest(τ))) else Λ

isprod(τ) = ¬issum(τ) ∧ ¬isdiff(τ) ∧ isop(τ, 0, "×")
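The string functions can be sketched in Python, with Python strings standing in for the strings of linear LISP. The sketch assumes that terms are written without blanks, that "×" is the multiplication sign, and that isdiff is defined by analogy with issum; that last definition is an assumption, since it is not written out here. The functions follow the definitions literally, so deparen looks only at the first character of its argument.

```python
def first(s):
    return s[0]

def rest(s):
    return s[1:]

def prefix(ch, s):
    return ch + s

def null(s):
    return s == ""

def isop(t, depth, op):
    """True if op occurs in t outside all parentheses (depth counts open ones)."""
    if null(t):
        return False
    if first(t) == "(":
        return isop(rest(t), depth + 1, op)
    if first(t) == ")":
        return isop(rest(t), depth - 1, op)
    if depth > 0:
        return isop(rest(t), depth, op)
    if first(t) == op:
        return True
    return isop(rest(t), depth, op)

def delim1(op, depth, t):
    """The part of t before the first occurrence of op outside parentheses."""
    if first(t) == "(":
        return prefix(first(t), delim1(op, depth + 1, rest(t)))
    if first(t) == ")":
        return prefix(first(t), delim1(op, depth - 1, rest(t)))
    if depth > 0:
        return prefix(first(t), delim1(op, depth, rest(t)))
    if first(t) == op:
        return ""
    return prefix(first(t), delim1(op, depth, rest(t)))

def dep1(t):
    """Drop the last character of t."""
    return prefix(first(t), dep1(rest(t))) if not null(rest(t)) else ""

def deparen(t):
    """Strip one pair of enclosing parentheses if t starts with '('."""
    return t if first(t) != "(" else dep1(rest(t))

def issum(t):
    return isop(t, 0, "+")

def isdiff(t):                      # assumed by analogy with issum
    return isop(t, 0, "-")

def isprod(t):
    return not issum(t) and not isdiff(t) and isop(t, 0, "×")

def addend(t):
    return deparen(delim1("+", 0, t))
```

For instance, "a+x×y" is classified as a sum with addend "a", while "x×(a+b)" is a product because its only "+" lies inside parentheses.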

7. What about ALGOL?

The  semantics  of  Microalgol  was  described  entirely  by  the  two 
formulas of section 5. Algol is considerably more complicated, but
it  will  be  relatively  easy  to  write  down  the  functions  once  we  have 
decided  what  goes  into  the  state.  The  following  complications  arise. 

1.  We  have  to  be  able  to  describe  the  situation  in  which  a  term 
is  partially  evaluated  in  order  to  describe  the  state  during  the 
execution  of  a  type  procedure. 

2.  The  chain  of  procedure  entries  must  be  described  including  the 
recursive  entries. 

3.  The  current  declarations  and  those  associated  with  higher  levels 
of  recursion  must  be  included. 

4. Call-by-name parameters require the association of expressions
to be evaluated.

5.  etc. 

I  believe  that  those  difficulties  can  be  resolved  and  that  a  clear 
description  of  the  state  of  an  Algol  computation  will  clarify  the  problem 
of  compiler  design. 

8.  Comparison  with  other  ways  of  Describing  Semantics 

We  believe  that  the  description  of  programming  languages  by  abstract 
syntax  and  state  transformation  functions  has  the  following  advantages. 

1.  Questions  of  notation  are  separated  from  semantic  questions  and 
postponed  until  the  concrete  syntax  has  to  be  defined. 


2.  Our  intuitive  idea  of  what  happens  when  a  statement  is  executed 
is  described  by  its  effect  on  the  state. 


3.  This  technique  will  lead  to  the  most  concise  and  understandable 
descriptions. 

4.  This  notion  of  semantics  corresponds  to  the  notions  of  Tarski 
etc.  that  are  current  in  mathematical  logic.  I  believe  that  describing 
languages  this  way  will  lead  to  the  possibility  of  proving  theorems  about 
compilers.  (See  the  notion  of  correctness  of  a  compiler  presented  in 
the  ICIP  paper). 

It  seems  to  me  that  there  are  two  other  approaches  to  the  problem 
incorporated  in  various  ways  in  various  papers.  One  of  these  ideas  is 
to  regard  the  Algol  data  itself  as  strings  of  symbols  and  the  state 
as  a  giant  string.  In  my  opinion  this  gets  a  long  way  from  the  intuitive 
ideas  of  Algol  and  enforces  decisions  in  areas  in  which  we  want  to 
remain uncommitted if only because of differences among machines.

A second approach is to define ALGOL by a compiler, either into
a machine language, into an intuitive subset of ALGOL, or into an abstract
system such as the λ-calculus. These definitions have a certain
practical value in resolving ambiguities, but as they do not correspond
to our intuitive ideas, they will make mathematical results difficult
to obtain and leave us with the problem of semantics of the target
language.

It  has  been  argued  that  since  the  formalism  used  above  is  a 
language,  all  semantic  descriptions  are  circular;  one  might  as  well 
explain  Algol  by  examples,  etc.  rather  than  ignore  the  difficulties. 

In  one  sense  this  objection  is  unanswerable.  Nothing  can  be 
explained to a stone; the reader must understand something beforehand.
The  same  objection  was  raised  against  Tarski's  efforts  to  describe 
the  semantics  of  mathematical  logic  which  have  proved  very  successful 
and  fruitful. 

There are two answers, a practical answer and a fundamental
answer:

1. The formalism I used is simpler than Algol and will lead
to understanding. The same advantage can be claimed for translations
into simple languages.

2. The fundamental answer is this. The purpose of semantics is
to describe the relation between the form of an expression and what
it stands for. From such relations other properties follow;
for example, we can define equivalence of Microalgol programs in
terms of how states are transformed and show that certain changes
in a program preserve equivalence.


